Section: Research Program

Data interlinking with link keys

Vast amounts of RDF data are made available on the web by various institutions providing overlapping information. For these data to be fully exploited, different representations of the same object across various data sets, often using different ontologies, have to be identified. When different vocabularies are used for describing data, it is necessary to identify the corresponding concepts they define. This task is called ontology matching and its result is an alignment A, i.e. a set of correspondences ⟨e, r, e'⟩ relating entities e and e' of two different ontologies by a particular relation r (which may be equivalence, subsumption, disjointness, etc.) [4].
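
As an illustration of the structure of an alignment, the following minimal sketch represents correspondences ⟨e, r, e'⟩ as plain tuples. The names `Relation` and `Correspondence`, the `ex1:`/`ex2:` prefixes and the entities themselves are hypothetical and chosen only for this example; they do not reflect any particular alignment format or API.

```python
from enum import Enum
from typing import NamedTuple


class Relation(Enum):
    """A few of the relations a correspondence may carry (illustrative subset)."""
    EQUIVALENCE = "="
    SUBSUMED_BY = "<"
    DISJOINT = "disjoint"


class Correspondence(NamedTuple):
    """One correspondence <e, r, e'> between entities of two ontologies."""
    e1: str
    relation: Relation
    e2: str


# An alignment is a set of correspondences (hypothetical entities).
alignment = {
    Correspondence("ex1:Livre", Relation.EQUIVALENCE, "ex2:Book"),
    Correspondence("ex1:auteur", Relation.EQUIVALENCE, "ex2:creator"),
    Correspondence("ex1:Roman", Relation.SUBSUMED_BY, "ex2:Book"),
}

# Extract the pairs of entities declared equivalent.
equivalent = {(c.e1, c.e2) for c in alignment if c.relation is Relation.EQUIVALENCE}
```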

At the data level, data interlinking is the process of generating links identifying the same resource described in two data sets. In parallel to ontology matching, from two data sets d and d' it generates a link set L, made of pairs of resource identifiers.

We have introduced link keys [4], [1], which extend database keys in a way that is better suited to RDF and that deals with two data sets instead of a single relation. An example of a link key expression is:

{⟨auteur, creator⟩} {⟨titre, title⟩} linkkey ⟨Livre, Book⟩

stating that whenever an instance of the class Livre has the same values for the property auteur as an instance of the class Book has for the property creator, and they share at least one value for their properties titre and title, then they denote the same entity. More precisely, a link key is a structure ⟨Keq, Kin, C⟩ such that:

Keq and Kin are sets of pairs of property expressions, and C is a pair of class expressions (a correspondence).

Such a link key holds if and only if, for any pair of resources belonging to the classes in correspondence, whenever the values of their properties in Keq are pairwise equal and the values of those in Kin pairwise intersect, the resources are the same. Link keys can then be used for finding equal individuals across two data sets and generating the corresponding owl:sameAs links. Link keys take into account the non-functionality of RDF data and have to deal with non-literal values. In particular, they may use arbitrary property and class expressions. This renders their discovery and use difficult.
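
To make the condition above concrete, here is a minimal sketch of how a link key could be applied to generate owl:sameAs links. It is not the authors' implementation. It rests on several assumptions: each data set is a plain dictionary mapping resource identifiers to multi-valued properties, class membership is stored under a hypothetical `type` property, properties and classes are simple names rather than arbitrary expressions, and equal value sets are additionally required to be non-empty. The function name `generate_links` and all sample data are invented for illustration.

```python
from typing import Dict, List, Set, Tuple

Resource = Dict[str, Set[str]]         # property name -> set of values
Dataset = Dict[str, Resource]          # resource identifier -> description
PropertyPairs = List[Tuple[str, str]]  # pairs <p, q> of property names


def generate_links(d1: Dataset, d2: Dataset,
                   k_eq: PropertyPairs, k_in: PropertyPairs,
                   classes: Tuple[str, str]) -> Set[Tuple[str, str, str]]:
    """Return owl:sameAs links for the pairs of resources satisfying the link key."""
    c1, c2 = classes
    links = set()
    for a, desc_a in d1.items():
        if c1 not in desc_a.get("type", set()):
            continue
        for b, desc_b in d2.items():
            if c2 not in desc_b.get("type", set()):
                continue
            # Keq: value sets must be equal (assumption: and non-empty).
            eq_ok = all(
                bool(desc_a.get(p)) and desc_a.get(p) == desc_b.get(q)
                for p, q in k_eq
            )
            # Kin: value sets must share at least one value.
            in_ok = all(
                desc_a.get(p, set()) & desc_b.get(q, set())
                for p, q in k_in
            )
            if eq_ok and in_ok:
                links.add((a, "owl:sameAs", b))
    return links


# Hypothetical sample data for the link key of the example.
d1 = {
    "ex1:l1": {"type": {"Livre"}, "auteur": {"Victor Hugo"},
               "titre": {"Les Misérables", "Les Miserables"}},
    "ex1:l2": {"type": {"Livre"}, "auteur": {"Jules Verne"},
               "titre": {"Vingt mille lieues sous les mers"}},
}
d2 = {
    "ex2:b1": {"type": {"Book"}, "creator": {"Victor Hugo"},
               "title": {"Les Misérables"}},
    "ex2:b2": {"type": {"Book"}, "creator": {"Jules Verne", "Michel Verne"},
               "title": {"Vingt mille lieues sous les mers"}},
}

links = generate_links(
    d1, d2,
    k_eq=[("auteur", "creator")],
    k_in=[("titre", "title")],
    classes=("Livre", "Book"),
)
```

In this sample, only ex1:l1 and ex2:b1 are linked: ex1:l2 and ex2:b2 share a title, but their author value sets differ, so the equality part of the key is not satisfied. The nested loop compares every pair of resources and is meant only to show the semantics, not an efficient procedure.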